[57] ABSTRACT
A semiconductor random access memory chip wherein
the cycle time is less than the access time for any combination
of read or write sequence. The semiconductor
random access memory chip is partitioned into relatively
small sub-arrays with local decoding and precharging.
The memory chip operates in a pipelined
manner with more than one access propagating through
the chip at any given time and wherein the cycle time is
limited by sub-array cycles wherein the cycle time is
less than the access time for a memory chip having
cycle times greater than access times for accesses
through the same sub-array. The memory chip also
incorporates dynamic storage techniques for achieving
very fast access and precharge times.

5 Claims, 10 Drawing Sheets 
[illegible] STRUCTURE HAVING IMPROVED CYCLE TIME

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to semiconductor static and dynamic memory structures and more particularly, to a pipelined semiconductor memory chip divided into sub-arrays having globally and locally generated decoding and locally generated precharge signals.

2. Background Art

The present invention includes a number of particular techniques and structures which are related to general concepts found in the prior art. For example, the present invention employs a form of sub-array structure, uses multiplexed sense amplifiers and incorporates a precharge technique.

Representative prior art references which describe memories with sub-arrays, but not for pipelined operation, include

U.S. Pat. No. 4,569,036, issued Feb. 4, 1986 to Fujii et al, entitled SEMICONDUCTOR DYNAMIC MEMORY DEVICE;

U.S. Pat. No. 4,554,646, issued Nov. 19, 1985 to Yoshimoto et al, entitled SEMICONDUCTOR MEMORY DEVICE;

U.S. Pat. No. 4,542,486, issued Sept. 17, 1985 to Anami et al, entitled SEMICONDUCTOR MEMORY DEVICE;

U.S. Pat. No. 4,482,984, issued Nov. 13, 1984 to Oritani, entitled STATIC TYPE SEMICONDUCTOR MEMORY DEVICE;

U.S. Pat. No. 4,447,895, issued May 8, 1984 to Asano et al, entitled SEMICONDUCTOR MEMORY DEVICE;

U.S. Pat. No. 4,384,347, issued May 17, 1983 to [name illegible], entitled SEMICONDUCTOR MEMORY DEVICE;

U.S. Pat. No. 4,222,112, issued Sept. 9, 1980 to Clemons et al, entitled [illegible] ORGANIZATION FOR REDUCING PEAK CURRENT.

References in the prior art directed to multiplexed sense amplifier input techniques include

U.S. Pat. No. 4,511,997, issued Apr. 16, 1985 to [name illegible], entitled SEMICONDUCTOR MEMORY DEVICE;

U.S. Pat. No. 4,509,148, issued Apr. 2, 1985 to Asano et al, entitled SEMICONDUCTOR MEMORY DEVICE;

U.S. Pat. No. 4,477,739, issued Oct. 16, 1984 to Pro[illegible], entitled MOSFET RANDOM ACCESS [illegible];

U.S. Pat. No. 4,447,893, issued May 8, 1984 to [name illegible], entitled [illegible] SEMICONDUCTOR READ [illegible];

U.S. Pat. No. 4,410,964, issued Oct. 18, 1983 to Nor[illegible], entitled [illegible] HAVING A [illegible] PORTS.

Descriptions of techniques using precharge signals dependent upon a memory address are found in U.S. Pat. No. 4,520,465, issued May 28, 1985 to Sood, entitled METHOD AND APPARATUS FOR SELECTIVELY PRECHARGING COLUMN LINES OF A MEMORY and U.S. Pat. No. 4,513,372, issued Apr. 23, 1985 to Ziegler et al, entitled UNIVERSAL MEM[illegible].

"A 32b VLSI System", Joseph W. Beyers, et al, 1982 Digest of Technical Papers, 1982 IEEE International Solid-State Circuits Conference, pages 128-129, mentions that a 128 Kb RAM is pipelined.

SUMMARY OF THE INVENTION

An object of the present invention is to provide a semiconductor random access memory chip wherein the cycle time is less than the access time for any combination of read or write sequence.

Another object of the present invention is to provide a semiconductor random access memory chip grouped into a plurality of sub-arrays.

A further object of the present invention is to provide a semiconductor random access memory chip that is partitioned into relatively small sub-arrays with local decoding and precharging.

Still another object of the present invention is to provide a semiconductor random access memory chip including relatively small memory sub-arrays which are operated in a pipelined manner with more than one access propagating through the chip at any given time and wherein the cycle time is limited by sub-array cycles.

A still further object of the present invention is to provide a semiconductor random access memory chip wherein the cycle time is less than the access time for a memory chip having cycle times greater than access times for accesses through the same sub-array.

Still another object of the present invention is to provide a semiconductor random access memory chip incorporating dynamic storage techniques for achieving very fast access and precharge times.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1(a) and 1(b) are schematic illustrations of a 256K semiconductor memory chip partitioned into a plurality of sub-arrays including bitswitches, sense amplifiers, word line drivers and precharge circuits according to the principles of the present invention.

FIG. 2 is a schematic illustration of a simplified depiction of a conventional 64K semiconductor memory chip including a plurality of macros according to the prior art.

FIG. 3-1 is a schematic illustration of a simplified depiction of a semiconductor memory chip including both a local precharge/reset technique and block address circuitry according to the principles of the present invention.

FIG. 3-2 is a schematic illustration of a simplified depiction of a semiconductor memory chip similar to that of FIG. 3-1 including both a local precharge/reset technique and block address compare circuitry and further including a compare technique according to the principles of the present invention.

FIGS. 4 and 5 are illustrations of timing diagrams useful in describing the operation of the semiconductor memory structure of the present invention.

FIGS. 6 and 7(a) and 7(b) are block diagram illustrations of the pipelined access path of a semiconductor memory chip according to the principles of the present invention.

DISCLOSURE OF THE INVENTION

Referring to FIG. 1, a schematic illustration, referred to in the art as a floor plan, is shown for a 256K bit embodiment of a CMOS semiconductor chip for a cache memory according to the present invention.
The particular embodiment of the 256K bit chip shown in FIG. 1 uses a second level metal layer to partition the chip into thirty-two 8K bit sub-arrays. Each sub-array is organized as 128 word lines by 64 bitline pairs with 4-way bitswitches and 16 resistively decoupled, self-timed sense amplifiers which are located inboard, next to the sub-array, because of the use of a second level metal layer. The structure uses standard CMOS memory cells composed of six devices. The present invention may include, however, embodiments using single layer metal as well as three, four or more metal layers.

More specifically, the 256K bit chip structure of FIG. 1 includes 32 sub-arrays arranged in 8 columns and 4 rows. The abbreviations used in FIG. 1 refer to the following elements.

WL          Word Line
BL          Bitline
CS          Chip Select Not Input
SA          Sense Amplifier
BITSW       Bitswitch
RBITSW      Read Bitswitch
WBITSW      Write Bitswitch
RS          Local Read Bitswitch Decoder/Driver
WS          Local Write Bitswitch Decoder/Driver
WLDR        Word Line Driver
BLPC        Bitline Precharge
DEC         Decoder
DR          Driver
ADDR AMPS   Address Amplifiers
DI          Data In
DO          Data Out
XA          X-Address Input
YA          Y-Address Input

As shown in FIG. 1, each sub-array includes a separate read bitswitch, write bitswitch, bitline precharge circuit, local word line driver and sense amplifier. Local word line and local read and write bitswitch decoder/drivers are associated with each of the 32 sub-arrays. X address amplifiers and Y address amplifiers are coupled to the word line and bitswitch decoder/drivers and block select decoder/drivers respectively, under control of a clock signal generated from the Chip Select Not Input. Data-in amplifiers provide inputs to each of the 32 sub-arrays under control of the clock signal and the write input.

The sense amplifiers associated with each of the 32 sub-arrays are connected to data output lines via data-out latches and off-chip drivers.

The sub-array arrangement illustrated in the embodiment of FIG. 1 includes local decoding and precharging and therefore, is operable in a pipelined manner with more than one access being capable of propagating through the chip at any given time. The cycle time of the chip is limited by the sub-array cycle time. FIG. 4 illustrates what is meant by cycle time of the chip and access time of the chip. Thus, chip access time (TACC1, TACC2, etc.) is the time it takes, beginning with a given chip to be selected, for the selected chip information to appear at the chip output. The chip cycle time is the selection repetition rate which indicates when, or how frequently, a chip can be selected. Cycle time is designated as 1, 2, 3, etc. on the horizontal axis of FIG. 4 and is shown to be less than the access time. "Cycle time" throughout the following invention description applies to either the write operation or the read operation, in any order.

Features of the chip of FIG. 1 include a chip cycle time that is less than the access time, while also having a fast access time. This is accomplished by a number of techniques.

One technique employed in FIG. 1 is that blocks in a critical path are designed such that their active plus precharge time is less than the access time of the chip. A key feature of the invention is that dynamic storage techniques are used to make it possible to achieve very fast access and precharge times. Also, a specific version of the known techniques of self-timing is used block-to-block and internally.

To reduce word line delay, the chip of FIG. 1 is segmented into 8 local word lines with the global word lines on a first level metal layer and the local word lines on a polycide layer.

The delay in developing data signals on the bitlines is reduced by segmenting the chip into 4 rows and by wiring the bitlines on a second level metal layer.

The block select decoders and driver circuits are centered to reduce metal RC delays.

Separate read and write paths are used with the write bitswitches placed at the opposite ends of the bitlines from the read bitswitches to minimize delay for both a read and write operation.

The 256K bit SRAM chip using the floor plan of FIG. 1 with sub-arrays is operated in a pipelined manner with more than one access propagating through the chip at any given time. In addition, the floor plan with inboard sense amplifiers is applicable to DRAM operation with only a slight increase in access time, with the restore portion of the cycle being hidden by the pipelined mode of operation, as will be more fully described relative to the DRAM embodiment of FIG. 3-2.

As previously stated, in the floor plan for a 256K SRAM shown in FIG. 1, the chip has been partitioned into 32 sub-arrays of 128 WL x 64 BL by making use of second layer metal. The optimum size and number of sub-arrays is influenced by chip access time requirements and array utilization. The second level of metal also makes it practical to have inboard sense amplifiers for improvement of access time by reducing the loading on the output lines. Bitswitches are used so a sense amplifier can be shared between four bit lines, reducing the loading on the sense-amp set signal, compared to having a sense-amp for each bitline. The sense amplifiers for each sub-array are self-timed locally and totally self-contained.

Each of the sub-arrays in the new floor plan is essentially self contained, with its own localized word line driver, self-timed sense amplifier circuitry and precharge circuitry. During an access only a single sub-array is activated. Having only a small fraction of the chip (1/32 for the 256K example) accessed each cycle has important ramifications for the design of a pipelined memory with more than one access propagating through the chip at a given time.

In simplified form, a prior art memory chip consists of a number of blocks or macros as shown in FIG. 2. During an access, data simply ripples from block to block with one block activating the next one and a global reset is used. In the prior art, as illustrated in FIG. 2, data "ripples" because the data from the output of one block activates the next block (i.e., well-known input triggering) but the blocks are "globally" reset by a signal generated by some other block which is fed back and resets a plurality of blocks, such as illustrated by the precharge reset connection in FIG. 2.
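
The floor-plan figures quoted for the 256K embodiment can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch: the constant and function names are hypothetical, and the numbers are the ones stated in the description (8 columns by 4 rows of sub-arrays, 128 word lines by 64 bitline pairs, 4-way bitswitches).

```python
# Illustrative arithmetic only. Names are hypothetical; the values are
# taken from the description of the 256K embodiment of FIG. 1.
WORD_LINES_PER_SUBARRAY = 128
BITLINE_PAIRS_PER_SUBARRAY = 64
BITSWITCH_WAYS = 4          # one sense amplifier shared by four bitline pairs
SUBARRAY_COLUMNS = 8
SUBARRAY_ROWS = 4


def floor_plan_summary():
    """Derive the quantities the description quotes for the 256K chip."""
    subarrays = SUBARRAY_COLUMNS * SUBARRAY_ROWS
    bits_per_subarray = WORD_LINES_PER_SUBARRAY * BITLINE_PAIRS_PER_SUBARRAY
    sense_amps_per_subarray = BITLINE_PAIRS_PER_SUBARRAY // BITSWITCH_WAYS
    total_bits = subarrays * bits_per_subarray
    # Number of address bits needed to select one sub-array.
    subarray_select_bits = (subarrays - 1).bit_length()
    return {
        "subarrays": subarrays,
        "bits_per_subarray": bits_per_subarray,
        "sense_amps_per_subarray": sense_amps_per_subarray,
        "total_bits": total_bits,
        "subarray_select_bits": subarray_select_bits,
    }


if __name__ == "__main__":
    for name, value in floor_plan_summary().items():
        print(f"{name}: {value}")
```

Under these assumptions the sketch gives 32 sub-arrays of 8K bits each, 16 sense amplifiers per sub-array and 256K bits in total, with five address bits selecting the sub-array.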
To achieve cycle time less than access time so the RAM can be pipelined, a localized precharge is performed as shown in FIG. 3-1 as an improvement over prior art global precharge as employed in FIG. 2. An example of the localized precharge is described in the publication by B. A. Chappell et al, in the IBM Technical Disclosure Bulletin, Vol. 30, No. 7, dated December 1987, entitled "Self-Timed Pulsed Wordline". Other examples of input triggered, self-resetting circuits are referred to in the art as address-transition-detection circuits. With the subdivided floor plan, the precharge signal can be generated locally and the loading on the precharge clock line is not large. The 256K design has only 8K bits of sub-array which must be precharged each cycle. The sub-arrays can be considered as an array of chips with only one of them being activated each selection. The sub-arrays with their own localized word line drivers, bitswitches, self-timed sense amplifiers and precharge circuits are virtually independent chips.

Additionally, each of the global blocks, external to the sub-array local circuitry, has self-timed precharge and reset circuitry. In other words, each block in the critical path shown in FIG. 3-1 is switched into the active state by the previous block's input signal, but is returned to its precharge/standby state by self-contained circuitry.

Being able to precharge a block very quickly after it has performed its function in anticipation of the next access is a key requirement for a memory with cycle time less than access time. The minimum time before another access can be started is the active time plus the precharge time for the slowest block in the access path. The sub-array precharge, because of the need to accurately equalize the bit lines, is difficult to accomplish in a short period of time. Thus, the chip cycle time is limited by the sub-array cycle time. A six-device CMOS cell, as used in the 256K SRAM, allows the shortest cycle time.

The floor plan of FIG. 1 with inboard sense amplifiers makes it possible to achieve almost the same access time with a Dynamic Random Access Memory (DRAM) [illegible] to restore the data in the accessed cell, it will take con[illegible].

In order to operate the chip of FIG. 1 in a pipelined mode of operation for the cases where a long precharge is needed, initiation of another access to the chip is permitted as long as that access is not to the same sub-array as in the last three previous accesses. As shown in FIG. 3-2, this is accomplished by comparing the sub-array selection bits with those of the previous three accesses. If the previous accesses are to different sub-arrays, resulting in a no match with the compare function, the new access would proceed while the previously accessed sub-array is being restored (thus "hiding" the restore portion of the previously accessed sub-array as it is overlapped by the new access). For the case where the compare found a match, the chip would go into a wait state until the sub-array precharge is completed and the new access is initiated.

By storing data from sequential addresses in different sub-arrays, it is possible to minimize the probability of an access to the last three accessed sub-arrays. For the 256K example given, there are 32 sub-arrays. Unless addresses were incremented by ½ word (32) increments, the probability of returning on successive accesses to the last three sub-arrays accessed is small. For random accesses, the probability of accessing one of the last three sub-arrays accessed is 3/32. A compare on five of the address bits is required each access. Thus, it is possible for a memory chip with long sub-array precharge to operate in a pipelined mode the majority of the time with cycle time less than access time.

The systems implication of a pipelined memory with cycle time less than access time can be understood by considering the timing diagrams of FIG. 4 and FIG. 5 and the pipeline segment block diagrams of FIG. 6 and FIG. 7. Two cases are considered. The first, in FIG. 4 and FIG. 6, assumes that the active plus precharge time of each block is less than ½ the access time. Therefore, the chip can be pipelined with a cycle time of ½ the access time for both a read and a write. No comparisons are needed on incoming addresses. For this case the bandwidth of the chip is twice what it would be for a chip with access time equal to cycle time that is not pipelined.

The second case (FIG. 5 and FIG. 7) assumes that the active plus precharge time of the slowest block (the sub-array) is twice the access time and that all other blocks are less than ½ the access time. It is also assumed that comparisons are done on incoming addresses to check whether or not the access is to a sub-array accessed on one of the last three cycles. For the case where the access is not to one of these same sub-arrays and there is no match on the compare, the chip will run in a pipelined mode with a cycle time of ½ the access time. If the access is to one of these same sub-arrays, there will be a match on the incoming address and the cycle time will be extended. Therefore, the bandwidth for this pipelined case compared to a chip that is not pipelined but has the same access time and a cycle time of twice the access time is

$$BW = BW_0 \left( 1 + \left( \frac{TNA - AC}{TNA} \right) \times 3 \right) \qquad (1)$$

where
$BW_0$ = bandwidth without pipelining
$BW$ = bandwidth with pipelining
$TNA$ = total number of accesses
$AC$ = accesses with compare.

If the accesses are random in nature, the bandwidth can be given by

$$BW = BW_0 \left( 1 + (1 - P_C) \times 3 \right) \qquad (2)$$

$$BW = BW_0 \left( 1 + \left( \frac{NSA - 3}{NSA} \right) \times 3 \right) \qquad (3)$$

where
$P_C$ = probability of a compare
$NSA$ = number of sub-arrays.

Thus for either random or sequential addresses one should see almost a four times increase in bandwidth compared to a conventional chip. For a DRAM, the amount of time a chip is not available because of refreshing would be reduced by this same factor.

Thus, two approaches to the design of a pipelined memory chip with cycle time less than access time using a floor plan with sub-arrays have been described. The first approach assumes the active plus precharge portions of each block is less than the access time. In the second approach, it is assumed that the active plus precharge portions of the sub-array block is greater than the access time, and that the active plus precharge portions of the rest of the blocks is less than the access time. For both cases, a substantial increase in memory chip bandwidth is possible in memory systems using SRAM and DRAM chips.
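
The bandwidth relations and the last-three-accesses compare described above can be explored numerically. The sketch below is an assumption-laden illustration, not the patented circuit: the function names are hypothetical, the compare is modelled as a simple lookup in a three-entry history of sub-array numbers, and a match is only counted (the wait state itself is not simulated).

```python
from collections import deque


def bandwidth_ratio(total_accesses, accesses_with_compare):
    """BW / BW0 = 1 + ((TNA - AC) / TNA) * 3, per the description."""
    return 1 + (total_accesses - accesses_with_compare) / total_accesses * 3


def random_bandwidth_ratio(num_subarrays, history=3):
    """BW / BW0 for random accesses: 1 + ((NSA - 3) / NSA) * 3.

    `history` is the number of previous accesses compared against
    (three in the description); the parameter name is an assumption.
    """
    return 1 + (num_subarrays - history) / num_subarrays * 3


def count_compare_matches(subarray_sequence, history=3):
    """Count accesses that hit one of the last `history` sub-arrays."""
    recent = deque(maxlen=history)  # sub-array numbers of recent accesses
    matches = 0
    for subarray in subarray_sequence:
        if subarray in recent:
            matches += 1            # the chip would enter a wait state here
        recent.append(subarray)
    return matches


if __name__ == "__main__":
    # Sequential addresses spread across 32 sub-arrays: no compare matches.
    sequence = [i % 32 for i in range(64)]
    ac = count_compare_matches(sequence)
    print("matches:", ac)
    print("sequential BW/BW0:", bandwidth_ratio(len(sequence), ac))
    print("random BW/BW0 (32 sub-arrays):", random_bandwidth_ratio(32))
```

With 32 sub-arrays the random-access ratio comes to about 3.7, and an interleaved sequential stream with no matches reaches the full factor of 4, in line with the "almost four times" figure in the text.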

Having thus described our invention, what we claim as new, and desire to secure by Letters Patent is:

1. A pipelined semiconductor $2^{n}$ Kbit memory chip, n being an integer not less than 2, said chip being segmented into a plurality of $2^{n-y}$ memory sub-arrays of $2^{y}$ Kbits arranged in columns and rows on said chip, each one of said $2^{n-y}$ memory sub-arrays includes a separate associated word line driver circuit means, sense amplifier circuit means and independent precharge circuit means connected thereto, each of said independent precharge circuit means of each of said segmented memory sub-arrays providing local self-timed reset and precharge function for each segmented memory array independent of said other of said plurality of $2^{n-y}$ memory arrays,
wherein said memory chip exhibits an access time t for providing data from said memory chip and wherein said local reset and precharge circuits of each of said segmented memory sub-arrays provides a cycle time for each sub-array which is less than chip access time t.

2. A pipelined semiconductor $2^{n}$ Kbit memory chip according to claim 1 wherein said segmented memory sub-arrays including independent local decoding and precharging means operate in a pipelined manner with greater than one access propagating through said $2^{n}$ Kbit memory chip at one time.

3. A pipelined semiconductor $2^{n}$ Kbit memory chip according to claim 2 further including row and column address circuits disposed on said chip connected to said word line and bitswitch decoder/driver circuit means and responsive to input address access signals for selecting ones of said $2^{n-y}$ segmented memory sub-arrays for access.

4. A pipelined semiconductor $2^{n}$ Kbit memory chip according to claim 1 further including compare circuits for comparing at least two signals for determining said sub-arrays being accessed by separate access signals.

5. A pipelined semiconductor $2^{n}$ Kbit memory chip according to claim 1 wherein said memory chip exhibits an access time t, further including global blocks on said chip external to said sub-arrays each containing clock circuit means, address buffer means, row decoder means, data output buffer means and word driver means associated with said plurality of sub-arrays,
each of said global blocks including separate reset and precharge circuit means for providing a cycle time for each of said global blocks which is less than chip access time t.